All Questions
7 questions
1 vote
2 answers
2k views
Learning from aggregated data
Online and in the literature there seems to be a general consensus that training a machine learning model using aggregated data is harder and/or fundamentally different from training on raw event data....
0 votes
0 answers
26 views
Can Hip-Hop/music trends be estimated?
How can I determine how an "evolutionary" music album affects the development of its genre? I can come up with only two perspectives: 1. The effect on the number of songs before and after (...
2 votes
2 answers
4k views
What is the difference between a data-driven model and an empirical model?
Are they the same? Empirical models, per Wikipedia, are any kind of (computer) modelling based on empirical observations rather than on mathematically describable relationships of the system ...
12 votes
2 answers
15k views
How to perform Logistic Regression with a large number of features?
I have a dataset with 330 samples and 27 features for each sample, with a binary class problem for Logistic Regression. According to the "rule of ten" I need at least 10 events for each feature to be ...
10 votes
6 answers
2k views
What are some of the best practices for sharing data and models with colleagues?
As a data scientist who recently joined a new team, I wanted to ask the community how they share data and models among their colleagues. Currently I have to resort to storing data in some central ...
1 vote
3 answers
1k views
Where can I get a comprehensive criminal dataset?
I want to create a machine learning model to predict the probability of a person committing a crime in the future, given his/her past crime history, age, gender, race, employment history, family information (...
4 votes
1 answer
110 views
Advice on making predictions given a collection of dimensions and corresponding probabilities
I am a CS graduate but am very new to data science. I could use some expert advice/insight on a problem I am trying to solve. I've been through the Titanic tutorial on kaggle.com which I think was ...